Grafting for Combinatorial Boolean Model using Frequent Itemset Mining

Authors

  • Taito Lee
  • Shin Matsushima
  • Kenji Yamanishi
Abstract

This paper introduces the combinatorial Boolean model (CBM), which is defined as the class of linear combinations of conjunctions of Boolean attributes. This paper addresses the issue of learning CBM from labeled data. CBM offers high knowledge interpretability, but naïve learning of it requires computation time that is exponential in the data dimension and sample size. To overcome this computational difficulty, we propose an algorithm, GRAB (GRAfting for Boolean datasets), which efficiently learns CBM within the L1-regularized loss minimization framework. The key idea of GRAB is to reduce the loss minimization problem to weighted frequent itemset mining, in which frequent patterns are efficiently computable. We employ benchmark datasets to empirically demonstrate that GRAB is effective in terms of computational efficiency, prediction accuracy and knowledge discovery.
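
To make the model class concrete, the following is a minimal sketch of how a CBM scores one Boolean input: each term is a conjunction (a set of attribute indices that must all be 1) paired with a weight, and the score is the sum of the weights of the satisfied conjunctions. The function name cbm_predict, the bias parameter, and the example terms are hypothetical and chosen for illustration only; this shows the model form described in the abstract, not the GRAB learning algorithm.

```python
from typing import Iterable, Sequence, Tuple

# A term is (conjunction, weight): the conjunction is a tuple of attribute
# indices that must all equal 1 for the term to contribute its weight.
Term = Tuple[Tuple[int, ...], float]


def cbm_predict(x: Sequence[int], terms: Iterable[Term], bias: float = 0.0) -> float:
    """Score a Boolean vector x with a linear combination of conjunctions.

    Illustrative only: `bias` and the term representation are assumptions,
    not notation taken from the paper.
    """
    score = bias
    for conjunction, weight in terms:
        # The conjunction fires only if every listed attribute is 1.
        if all(x[i] == 1 for i in conjunction):
            score += weight
    return score


# Hypothetical model with three conjunctions over three Boolean attributes.
terms = [((0,), 1.5), ((0, 2), -2.0), ((1, 2), 0.5)]

print(cbm_predict([1, 0, 1], terms))  # -0.5: terms (0,) and (0, 2) fire
```

With d attributes there are 2^d candidate conjunctions, which is why the abstract describes naïve learning as exponentially expensive; under L1 regularization only a small number of terms would be expected to carry nonzero weight, so a sparse list of terms like the one above is a natural representation.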

Similar resources

Fast Vertical Mining Using Boolean Algebra

The vertical association rule mining algorithm is an efficient mining method that uses the support sets of frequent itemsets to calculate the support of candidate itemsets. It avoids the repeated database scans required by the Apriori algorithm. In vertical mining, frequent itemsets can be represented as a set of bit vectors in memory, which enables fast computation. The...

Computational Intelligence in Data Mining

In this paper, we propose a method of finding simple disjoint decompositions in frequent itemset data. Techniques for decomposing Boolean functions have long been studied in the area of logic circuit design, and recently a very efficient algorithm based on BDDs (Binary Decision Diagrams) has become available for finding all possible simple disjoint decompositions of a given Boolean function. We...

A New Algorithm for High Average-utility Itemset Mining

High utility itemset mining (HUIM) is an emerging field in data mining that has gained growing interest due to its various applications. The goal of this problem is to discover all itemsets whose utility exceeds a minimum threshold. The basic HUIM problem does not consider the length of itemsets in its utility measurement, and utility values tend to become higher for itemsets containing more items...

Ramp: High Performance Frequent Itemset Mining with Efficient Bit-Vector Projection Technique

Mining frequent itemsets using a bit-vector representation is very efficient for small dense datasets, but highly inefficient for sparse datasets due to the lack of any efficient bit-vector projection technique. In this paper we present a novel, efficient bit-vector projection technique for sparse and dense datasets. We also present a new frequent itemset mining algorithm Ramp (Real Algorithm...

Accelerating Closed Frequent Itemset Mining by Elimination of Null Transactions

The mining of frequent itemsets is often challenged by the length of the patterns mined and by the number of transactions considered in the mining process. Another acute challenge for the performance of any association rule mining algorithm is the presence of "null" transactions. This work proposes a closed frequent itemset mining algorithm viz., Closed Frequent Itemset Mining a...

Journal title:
  • CoRR

Volume abs/1711.02478  Issue 

Pages  -

Publication date 2017